Overview

Brought to you by YData

Dataset statistics

Number of variables: 14
Number of observations: 180000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 75.3 MiB
Average record size in memory: 438.8 B
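
The average record size can be cross-checked from the total memory size and the number of observations. The sketch below assumes 1 MiB = 1024² bytes; because the report rounds the total to one decimal place, the result only approximates the reported 438.8 B.

```python
# Recompute the average record size from the rounded overview figures.
MIB = 1024 ** 2  # bytes per mebibyte (assumed unit convention)

total_mib = 75.3          # "Total size in memory", already rounded in the report
n_observations = 180_000  # "Number of observations"

avg_record_bytes = total_mib * MIB / n_observations
print(f"{avg_record_bytes:.1f} B per record")
```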

Variable types

Numeric: 8
Categorical: 5
Boolean: 1

Alerts

cb_person_cred_hist_length is highly correlated overall with person_age and person_emp_exp (High correlation)
loan_amnt is highly correlated overall with loan_percent_income (High correlation)
loan_percent_income is highly correlated overall with loan_amnt (High correlation)
loan_status is highly correlated overall with previous_loan_defaults_on_file (High correlation)
person_age is highly correlated overall with cb_person_cred_hist_length and person_emp_exp (High correlation)
person_emp_exp is highly correlated overall with cb_person_cred_hist_length and person_age (High correlation)
previous_loan_defaults_on_file is highly correlated overall with loan_status (High correlation)
person_income is highly skewed (γ1 = 34.13672968) (Skewed)
person_id is uniformly distributed (Uniform)
person_id has unique values (Unique)
person_emp_exp has 38264 (21.3%) zeros (Zeros)
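
The skewness alert reports γ1, the third standardized moment. The sketch below computes it on a small made-up list to show why a few extreme values push γ1 far above zero. The toy data, the function name `skewness`, and the optional bias correction are illustrative assumptions; the report does not state which estimator or alert threshold was used.

```python
import math

def skewness(values, adjusted=True):
    """Sample skewness g1 = m3 / m2**1.5; adjusted=True applies the
    small-sample correction sqrt(n(n-1)) / (n-2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    if not adjusted:
        return g1
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Hypothetical toy incomes: mostly moderate values plus one extreme value.
toy_incomes = [30_000, 45_000, 50_000, 60_000, 65_000, 70_000, 2_000_000]
symmetric = [1, 2, 3, 4, 5]

print("right-skewed sample:", skewness(toy_incomes))
print("symmetric sample:", skewness(symmetric))
```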

Reproduction

Analysis started: 2024-12-20 03:21:24.127755
Analysis finished: 2024-12-20 03:21:57.225356
Duration: 33.1 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
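
The reported duration is the difference between the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-12-20 03:21:24.127755")
finished = datetime.fromisoformat("2024-12-20 03:21:57.225356")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.1f} seconds")
```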

Variables

person_id
Real number (ℝ)

Uniform, Unique

Distinct: 180000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90000.5
Minimum: 1
Maximum: 180000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 9000.95
Q1: 45000.75
Median: 90000.5
Q3: 135000.25
95-th percentile: 171000.05
Maximum: 180000
Range: 179999
Interquartile range (IQR): 89999.5

Descriptive statistics

Standard deviation: 51961.669
Coefficient of variation (CV): 0.57734867
Kurtosis: -1.2
Mean: 90000.5
Median Absolute Deviation (MAD): 45000
Skewness: 0
Sum: 1.620009 × 10^10
Variance: 2.700015 × 10^9
Monotonicity: Strictly increasing

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
120004 | 1 | < 0.1%
119996 | 1 | < 0.1%
119997 | 1 | < 0.1%
119998 | 1 | < 0.1%
119999 | 1 | < 0.1%
120000 | 1 | < 0.1%
120001 | 1 | < 0.1%
120002 | 1 | < 0.1%
120003 | 1 | < 0.1%
Other values (179990) | 179990 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
180000 | 1 | < 0.1%
179999 | 1 | < 0.1%
179998 | 1 | < 0.1%
179997 | 1 | < 0.1%
179996 | 1 | < 0.1%
179995 | 1 | < 0.1%
179994 | 1 | < 0.1%
179993 | 1 | < 0.1%
179992 | 1 | < 0.1%
179991 | 1 | < 0.1%
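
Since person_id is strictly increasing with 180000 distinct values between 1 and 180000, it is the sequence 1, 2, ..., n, and its statistics have closed forms. The sketch below reproduces them, assuming the sample variance (divisor n − 1) and linearly interpolated quantiles; both assumptions are consistent with the figures reported above.

```python
import math

n = 180_000  # person_id takes the values 1, 2, ..., n

total = n * (n + 1) // 2      # sum of 1..n
mean = (n + 1) / 2
variance = n * (n + 1) / 12   # sample variance (divisor n - 1) of 1..n
std = math.sqrt(variance)

def quantile(q):
    """Linearly interpolated quantile of the sorted values 1..n."""
    return 1 + q * (n - 1)

print("sum:", total)
print("mean:", mean)
print("variance:", variance)
print("std:", round(std, 3))
print("Q1, Q3:", quantile(0.25), quantile(0.75))
```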

person_age
Real number (ℝ)

High correlation

Distinct: 60
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.764178
Minimum: 20
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 20
5-th percentile: 22
Q1: 24
Median: 26
Q3: 30
95-th percentile: 39
Maximum: 144
Range: 124
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 6.0450578
Coefficient of variation (CV): 0.21772868
Kurtosis: 18.647795
Mean: 27.764178
Median Absolute Deviation (MAD): 3
Skewness: 2.5480903
Sum: 4997552
Variance: 36.542724
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
23 | 21016 | 11.7%
24 | 20552 | 11.4%
25 | 18028 | 10.0%
22 | 16944 | 9.4%
26 | 14636 | 8.1%
27 | 12380 | 6.9%
28 | 10912 | 6.1%
29 | 9820 | 5.5%
30 | 8084 | 4.5%
31 | 6580 | 3.7%
Other values (50) | 41048 | 22.8%

Minimum 10 values

Value | Count | Frequency (%)
20 | 68 | < 0.1%
21 | 5156 | 2.9%
22 | 16944 | 9.4%
23 | 21016 | 11.7%
24 | 20552 | 11.4%
25 | 18028 | 10.0%
26 | 14636 | 8.1%
27 | 12380 | 6.9%
28 | 10912 | 6.1%
29 | 9820 | 5.5%

Maximum 10 values

Value | Count | Frequency (%)
144 | 12 | < 0.1%
123 | 8 | < 0.1%
116 | 4 | < 0.1%
109 | 4 | < 0.1%
94 | 4 | < 0.1%
84 | 4 | < 0.1%
80 | 4 | < 0.1%
78 | 4 | < 0.1%
76 | 4 | < 0.1%
73 | 12 | < 0.1%
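
Several of the person_age figures are derived from others in the same summary: the range and IQR from the quantiles, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. A short sketch of those relationships, using the values reported above:

```python
# Figures copied from the person_age summary.
minimum, q1, q3, maximum = 20, 24, 30, 144
mean, std = 27.764178, 6.0450578

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print("range:", value_range)
print("IQR:", iqr)
print("CV:", round(cv, 8))
print("variance:", round(variance, 6))
```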

person_gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.6 MiB

Length

Max length: 6
Median length: 4
Mean length: 4.8959556
Min length: 4

[Figure: histogram of lengths of the category]

Characters and Unicode

Total characters: 881272
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: female
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value | Count | Frequency (%)
male | 99364 | 55.2%
female | 80636 | 44.8%

Words

Value | Count | Frequency (%)
male | 99364 | 55.2%
female | 80636 | 44.8%

Most occurring characters

Value | Count | Frequency (%)
e | 260636 | 29.6%
m | 180000 | 20.4%
a | 180000 | 20.4%
l | 180000 | 20.4%
f | 80636 | 9.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 881272 | 100.0%
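
The length and character figures for person_gender follow directly from the two category counts. A minimal sketch of that arithmetic, using the counts from the Common Values table:

```python
from collections import Counter

# Category counts copied from the Common Values table.
counts = {"male": 99_364, "female": 80_636}

n_rows = sum(counts.values())
total_characters = sum(len(value) * count for value, count in counts.items())
mean_length = total_characters / n_rows

# Per-character totals: each character of a value occurs once per row.
char_counts = Counter()
for value, count in counts.items():
    for ch in value:
        char_counts[ch] += count

print("total characters:", total_characters)
print("mean length:", round(mean_length, 7))
print("characters:", char_counts.most_common())
```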

person_education
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 MiB

Length

Max length: 11
Median length: 9
Mean length: 8.769
Min length: 6

[Figure: histogram of lengths of the category]

Characters and Unicode

Total characters: 1578420
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Master
2nd row: High School
3rd row: High School
4th row: Bachelor
5th row: Master

Common Values

Value | Count | Frequency (%)
Bachelor | 53596 | 29.8%
Associate | 48112 | 26.7%
High School | 47888 | 26.6%
Master | 27920 | 15.5%
Doctorate | 2484 | 1.4%

Words

Value | Count | Frequency (%)
bachelor | 53596 | 23.5%
associate | 48112 | 21.1%
high | 47888 | 21.0%
school | 47888 | 21.0%
master | 27920 | 12.3%
doctorate | 2484 | 1.1%

Most occurring characters

Value | Count | Frequency (%)
o | 202452 | 12.8%
c | 152080 | 9.6%
h | 149372 | 9.5%
e | 132112 | 8.4%
a | 132112 | 8.4%
s | 124144 | 7.9%
l | 101484 | 6.4%
i | 96000 | 6.1%
r | 84000 | 5.3%
t | 81000 | 5.1%
Other values (8) | 323664 | 20.5%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 1578420 | 100.0%
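
The word-level percentages for person_education differ from the Common Values percentages because "High School" contributes two words, so the denominator is the total word count rather than the row count. The sketch below assumes words are obtained by lowercasing and splitting on whitespace, a rule inferred from the table rather than stated in the report.

```python
from collections import Counter

# Category counts copied from the Common Values table.
value_counts = {
    "Bachelor": 53_596,
    "Associate": 48_112,
    "High School": 47_888,
    "Master": 27_920,
    "Doctorate": 2_484,
}

# Assumed tokenization: lowercase, then split on whitespace.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
shares = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}

print("total words:", total_words)
for word, share in shares.items():
    print(f"{word}: {share}%")
```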

person_income
Real number (ℝ)

Skewed

Distinct: 33989
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80319.053
Minimum: 8000
Maximum: 7200766
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 8000
5-th percentile: 28366.7
Q1: 47204
Median: 67048
Q3: 95789.25
95-th percentile: 166754.7
Maximum: 7200766
Range: 7192766
Interquartile range (IQR): 48585.25

Descriptive statistics

Standard deviation: 80421.828
Coefficient of variation (CV): 1.0012796
Kurtosis: 2398.4848
Mean: 80319.053
Median Absolute Deviation (MAD): 23124
Skewness: 34.13673
Sum: 1.445743 × 10^10
Variance: 6.4676705 × 10^9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
8000 | 60 | < 0.1%
73011 | 40 | < 0.1%
36995 | 36 | < 0.1%
60914 | 32 | < 0.1%
37020 | 32 | < 0.1%
73082 | 28 | < 0.1%
60864 | 28 | < 0.1%
67131 | 28 | < 0.1%
72951 | 28 | < 0.1%
73040 | 28 | < 0.1%
Other values (33979) | 179660 | 99.8%

Minimum 10 values

Value | Count | Frequency (%)
8000 | 60 | < 0.1%
8037 | 4 | < 0.1%
8104 | 4 | < 0.1%
8186 | 4 | < 0.1%
8248 | 4 | < 0.1%
8267 | 4 | < 0.1%
8277 | 4 | < 0.1%
8302 | 4 | < 0.1%
8518 | 4 | < 0.1%
9364 | 4 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
7200766 | 4 | < 0.1%
5556399 | 4 | < 0.1%
5545545 | 4 | < 0.1%
2448661 | 4 | < 0.1%
2280980 | 4 | < 0.1%
2139143 | 4 | < 0.1%
2012954 | 4 | < 0.1%
1741243 | 4 | < 0.1%
1728974 | 4 | < 0.1%
1661567 | 4 | < 0.1%

person_emp_exp
Real number (ℝ)

High correlation, Zeros

Distinct: 63
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4103333
Minimum: 0
Maximum: 125
Zeros: 38264
Zeros (%): 21.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 4
Q3: 8
95-th percentile: 17
Maximum: 125
Range: 125
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 6.0634816
Coefficient of variation (CV): 1.1207224
Kurtosis: 19.166626
Mean: 5.4103333
Median Absolute Deviation (MAD): 3
Skewness: 2.5948525
Sum: 973860
Variance: 36.765809
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 38264 | 21.3%
2 | 16536 | 9.2%
1 | 16244 | 9.0%
3 | 15560 | 8.6%
4 | 14096 | 7.8%
5 | 12000 | 6.7%
6 | 10868 | 6.0%
7 | 8816 | 4.9%
8 | 7560 | 4.2%
9 | 6300 | 3.5%
Other values (53) | 33756 | 18.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 38264 | 21.3%
1 | 16244 | 9.0%
2 | 16536 | 9.2%
3 | 15560 | 8.6%
4 | 14096 | 7.8%
5 | 12000 | 6.7%
6 | 10868 | 6.0%
7 | 8816 | 4.9%
8 | 7560 | 4.2%
9 | 6300 | 3.5%

Maximum 10 values

Value | Count | Frequency (%)
125 | 4 | < 0.1%
124 | 4 | < 0.1%
121 | 4 | < 0.1%
101 | 4 | < 0.1%
100 | 4 | < 0.1%
93 | 4 | < 0.1%
85 | 4 | < 0.1%
76 | 4 | < 0.1%
62 | 4 | < 0.1%
61 | 4 | < 0.1%

person_home_ownership
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.7 MiB

Length

Max length: 8
Median length: 4
Mean length: 5.5804889
Min length: 3

[Figure: histogram of lengths of the category]

Characters and Unicode

Total characters: 1004488
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RENT
2nd row: OWN
3rd row: MORTGAGE
4th row: RENT
5th row: RENT

Common Values

Value | Count | Frequency (%)
RENT | 93772 | 52.1%
MORTGAGE | 73956 | 41.1%
OWN | 11804 | 6.6%
OTHER | 468 | 0.3%

Words

Value | Count | Frequency (%)
rent | 93772 | 52.1%
mortgage | 73956 | 41.1%
own | 11804 | 6.6%
other | 468 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
R | 168196 | 16.7%
E | 168196 | 16.7%
T | 168196 | 16.7%
G | 147912 | 14.7%
N | 105576 | 10.5%
O | 86228 | 8.6%
M | 73956 | 7.4%
A | 73956 | 7.4%
W | 11804 | 1.2%
H | 468 | < 0.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 1004488 | 100.0%

cb_person_cred_hist_length
Real number (ℝ)

High correlation

Distinct: 29
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.8674889
Minimum: 2
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 3
Median: 4
Q3: 8
95-th percentile: 14
Maximum: 30
Range: 28
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.8796695
Coefficient of variation (CV): 0.66121463
Kurtosis: 3.725534
Mean: 5.8674889
Median Absolute Deviation (MAD): 2
Skewness: 1.6316792
Sum: 1056148
Variance: 15.051836
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=29)]

Common values

Value | Count | Frequency (%)
4 | 34612 | 19.2%
3 | 33248 | 18.5%
2 | 26148 | 14.5%
5 | 12328 | 6.8%
6 | 11864 | 6.6%
7 | 11556 | 6.4%
8 | 11200 | 6.2%
9 | 10740 | 6.0%
10 | 9828 | 5.5%
12 | 2860 | 1.6%
Other values (19) | 15616 | 8.7%

Minimum 10 values

Value | Count | Frequency (%)
2 | 26148 | 14.5%
3 | 33248 | 18.5%
4 | 34612 | 19.2%
5 | 12328 | 6.8%
6 | 11864 | 6.6%
7 | 11556 | 6.4%
8 | 11200 | 6.2%
9 | 10740 | 6.0%
10 | 9828 | 5.5%
11 | 2848 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
30 | 92 | 0.1%
29 | 60 | < 0.1%
28 | 116 | 0.1%
27 | 92 | 0.1%
26 | 80 | < 0.1%
25 | 92 | 0.1%
24 | 136 | 0.1%
23 | 104 | 0.1%
22 | 128 | 0.1%
21 | 96 | 0.1%

loan_amnt
Real number (ℝ)

High correlation

Distinct: 4483
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9583.1576
Minimum: 500
Maximum: 35000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 500
5-th percentile: 2000
Q1: 5000
Median: 8000
Q3: 12237.25
95-th percentile: 24000
Maximum: 35000
Range: 34500
Interquartile range (IQR): 7237.25

Descriptive statistics

Standard deviation: 6314.8341
Coefficient of variation (CV): 0.65895129
Kurtosis: 1.3510026
Mean: 9583.1576
Median Absolute Deviation (MAD): 3800
Skewness: 1.1797018
Sum: 1.7249684 × 10^9
Variance: 39877129
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
10000 | 14468 | 8.0%
5000 | 11148 | 6.2%
6000 | 9704 | 5.4%
12000 | 9664 | 5.4%
15000 | 8016 | 4.5%
8000 | 7712 | 4.3%
4000 | 5624 | 3.1%
20000 | 5540 | 3.1%
3000 | 5512 | 3.1%
7000 | 5256 | 2.9%
Other values (4473) | 97356 | 54.1%

Minimum 10 values

Value | Count | Frequency (%)
500 | 20 | < 0.1%
563 | 4 | < 0.1%
700 | 4 | < 0.1%
725 | 4 | < 0.1%
750 | 4 | < 0.1%
800 | 4 | < 0.1%
900 | 8 | < 0.1%
912 | 4 | < 0.1%
922 | 4 | < 0.1%
950 | 4 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
35000 | 936 | 0.5%
34826 | 4 | < 0.1%
34800 | 4 | < 0.1%
34664 | 4 | < 0.1%
34375 | 4 | < 0.1%
34322 | 4 | < 0.1%
34121 | 4 | < 0.1%
34000 | 16 | < 0.1%
33950 | 8 | < 0.1%
33800 | 4 | < 0.1%

loan_intent
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.5 MiB

Length

Max length: 17
Median length: 15
Mean length: 10.012711
Min length: 7

[Figure: histogram of lengths of the category]

Characters and Unicode

Total characters: 1802288
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PERSONAL
2nd row: EDUCATION
3rd row: MEDICAL
4th row: MEDICAL
5th row: MEDICAL

Common Values

Value | Count | Frequency (%)
EDUCATION | 36612 | 20.3%
MEDICAL | 34192 | 19.0%
VENTURE | 31276 | 17.4%
PERSONAL | 30208 | 16.8%
DEBTCONSOLIDATION | 28580 | 15.9%
HOMEIMPROVEMENT | 19132 | 10.6%

Words

Value | Count | Frequency (%)
education | 36612 | 20.3%
medical | 34192 | 19.0%
venture | 31276 | 17.4%
personal | 30208 | 16.8%
debtconsolidation | 28580 | 15.9%
homeimprovement | 19132 | 10.6%

Most occurring characters

Value | Count | Frequency (%)
E | 249540 | 13.8%
O | 190824 | 10.6%
N | 174388 | 9.7%
I | 147096 | 8.2%
T | 144180 | 8.0%
A | 129592 | 7.2%
D | 127964 | 7.1%
C | 99384 | 5.5%
L | 92980 | 5.2%
M | 91588 | 5.1%
Other values (7) | 354752 | 19.7%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 1802288 | 100.0%

loan_int_rate
Real number (ℝ)

Distinct: 1302
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.006606
Minimum: 5.42
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 5.42
5-th percentile: 6.17
Q1: 8.59
Median: 11.01
Q3: 12.99
95-th percentile: 16
Maximum: 20
Range: 14.58
Interquartile range (IQR): 4.4

Descriptive statistics

Standard deviation: 2.9787835
Coefficient of variation (CV): 0.27063597
Kurtosis: -0.42040028
Mean: 11.006606
Median Absolute Deviation (MAD): 2.13
Skewness: 0.21377873
Sum: 1981189
Variance: 8.8731509
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
11.01 | 13316 | 7.4%
10.99 | 3216 | 1.8%
7.51 | 3192 | 1.8%
7.49 | 2748 | 1.5%
7.88 | 2692 | 1.5%
5.42 | 2432 | 1.4%
7.9 | 2424 | 1.3%
11.49 | 2056 | 1.1%
9.99 | 1936 | 1.1%
13.49 | 1900 | 1.1%
Other values (1292) | 144088 | 80.0%

Minimum 10 values

Value | Count | Frequency (%)
5.42 | 2432 | 1.4%
5.43 | 8 | < 0.1%
5.44 | 8 | < 0.1%
5.46 | 4 | < 0.1%
5.47 | 20 | < 0.1%
5.48 | 16 | < 0.1%
5.49 | 16 | < 0.1%
5.5 | 4 | < 0.1%
5.51 | 12 | < 0.1%
5.52 | 8 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
20 | 336 | 0.2%
19.91 | 36 | < 0.1%
19.9 | 4 | < 0.1%
19.82 | 20 | < 0.1%
19.8 | 4 | < 0.1%
19.79 | 16 | < 0.1%
19.74 | 16 | < 0.1%
19.69 | 48 | < 0.1%
19.66 | 12 | < 0.1%
19.62 | 4 | < 0.1%

loan_percent_income
Real number (ℝ)

High correlation

Distinct: 64
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.13972489
Minimum: 0
Maximum: 0.66
Zeros: 108
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.03
Q1: 0.07
Median: 0.12
Q3: 0.19
95-th percentile: 0.31
Maximum: 0.66
Range: 0.66
Interquartile range (IQR): 0.12

Descriptive statistics

Standard deviation: 0.087211581
Coefficient of variation (CV): 0.6241664
Kurtosis: 1.082226
Mean: 0.13972489
Median Absolute Deviation (MAD): 0.05
Skewness: 1.0344863
Sum: 25150.48
Variance: 0.0076058599
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.08 | 10372 | 5.8%
0.1 | 9684 | 5.4%
0.07 | 9660 | 5.4%
0.09 | 9180 | 5.1%
0.06 | 8968 | 5.0%
0.12 | 8864 | 4.9%
0.05 | 8704 | 4.8%
0.11 | 8632 | 4.8%
0.14 | 7840 | 4.4%
0.04 | 7800 | 4.3%
Other values (54) | 90296 | 50.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 108 | 0.1%
0.01 | 1260 | 0.7%
0.02 | 3776 | 2.1%
0.03 | 5952 | 3.3%
0.04 | 7800 | 4.3%
0.05 | 8704 | 4.8%
0.06 | 8968 | 5.0%
0.07 | 9660 | 5.4%
0.08 | 10372 | 5.8%
0.09 | 9180 | 5.1%

Maximum 10 values

Value | Count | Frequency (%)
0.66 | 4 | < 0.1%
0.63 | 4 | < 0.1%
0.62 | 8 | < 0.1%
0.61 | 8 | < 0.1%
0.59 | 4 | < 0.1%
0.58 | 4 | < 0.1%
0.57 | 4 | < 0.1%
0.56 | 20 | < 0.1%
0.55 | 20 | < 0.1%
0.54 | 32 | < 0.1%

loan_status
Categorical

High correlation

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.0 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

[Figure: histogram of lengths of the category]

Characters and Unicode

Total characters: 180000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 140000 | 77.8%
1 | 40000 | 22.2%

Words

Value | Count | Frequency (%)
0 | 140000 | 77.8%
1 | 40000 | 22.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 140000 | 77.8%
1 | 40000 | 22.2%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 180000 | 100.0%
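
The loan_status counts show an imbalance between the two classes. A short sketch of the class shares and the ratio between them, using the counts reported above:

```python
# Class counts copied from the loan_status Common Values table.
counts = {0: 140_000, 1: 40_000}

total = sum(counts.values())
shares = {label: count / total for label, count in counts.items()}
imbalance_ratio = counts[0] / counts[1]  # rows with status 0 per row with status 1

for label, share in shares.items():
    print(f"loan_status={label}: {share:.1%}")
print("ratio 0:1 =", imbalance_ratio)
```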

previous_loan_defaults_on_file
Boolean

High correlation

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 175.9 KiB

Value | Count | Frequency (%)
True | 91432 | 50.8%
False | 88568 | 49.2%

Interactions

[Figures: 64 pairwise interaction plots, rendered with Matplotlib v3.8.0; images not included in this text version.]

Correlations

[Figure: correlation heatmap]

Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13) | (14)
(1) cb_person_cred_hist_length | 1.000 | 0.043 | 0.017 | 0.055 | -0.037 | 0.024 | 0.821 | 0.092 | 0.750 | 0.029 | 0.030 | 0.128 | 0.093 | 0.029
(2) loan_amnt | 0.043 | 1.000 | 0.105 | 0.033 | 0.666 | 0.126 | 0.064 | 0.012 | 0.052 | 0.013 | 0.091 | 0.017 | 0.405 | 0.068
(3) loan_int_rate | 0.017 | 0.105 | 1.000 | 0.021 | 0.124 | 0.363 | 0.013 | 0.013 | 0.016 | 0.008 | 0.085 | 0.005 | -0.033 | 0.198
(4) loan_intent | 0.055 | 0.033 | 0.021 | 1.000 | 0.022 | 0.142 | 0.032 | 0.015 | 0.031 | 0.005 | 0.083 | 0.032 | 0.013 | 0.081
(5) loan_percent_income | -0.037 | 0.666 | 0.124 | 0.022 | 1.000 | 0.415 | -0.056 | 0.011 | -0.050 | 0.009 | 0.092 | -0.002 | -0.353 | 0.220
(6) loan_status | 0.024 | 0.126 | 0.363 | 0.142 | 0.415 | 1.000 | 0.017 | 0.005 | 0.018 | 0.000 | 0.258 | 0.093 | 0.013 | 0.543
(7) person_age | 0.821 | 0.064 | 0.013 | 0.032 | -0.056 | 0.017 | 1.000 | 0.061 | 0.888 | 0.026 | 0.019 | 0.122 | 0.143 | 0.032
(8) person_education | 0.092 | 0.012 | 0.013 | 0.015 | 0.011 | 0.005 | 0.061 | 1.000 | 0.066 | 0.002 | 0.010 | 0.041 | 0.010 | 0.041
(9) person_emp_exp | 0.750 | 0.052 | 0.016 | 0.031 | -0.050 | 0.018 | 0.888 | 0.066 | 1.000 | 0.024 | 0.015 | 0.101 | 0.120 | 0.031
(10) person_gender | 0.029 | 0.013 | 0.008 | 0.005 | 0.009 | 0.000 | 0.026 | 0.002 | 0.024 | 1.000 | 0.000 | 0.000 | 0.013 | 0.000
(11) person_home_ownership | 0.030 | 0.091 | 0.085 | 0.083 | 0.092 | 0.258 | 0.019 | 0.010 | 0.015 | 0.000 | 1.000 | 0.061 | 0.012 | 0.140
(12) person_id | 0.128 | 0.017 | 0.005 | 0.032 | -0.002 | 0.093 | 0.122 | 0.041 | 0.101 | 0.000 | 0.061 | 1.000 | 0.025 | 0.037
(13) person_income | 0.093 | 0.405 | -0.033 | 0.013 | -0.353 | 0.013 | 0.143 | 0.010 | 0.120 | 0.013 | 0.012 | 0.025 | 1.000 | 0.012
(14) previous_loan_defaults_on_file | 0.029 | 0.068 | 0.198 | 0.081 | 0.220 | 0.543 | 0.032 | 0.041 | 0.031 | 0.000 | 0.140 | 0.037 | 0.012 | 1.000
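
The "High correlation" alerts can be traced back to this matrix by listing the off-diagonal pairs whose absolute coefficient exceeds a cutoff. The sketch below assumes a cutoff of 0.5; that value is an illustrative assumption, since the threshold actually used is defined in the report's config.json and is not shown here.

```python
names = [
    "cb_person_cred_hist_length", "loan_amnt", "loan_int_rate", "loan_intent",
    "loan_percent_income", "loan_status", "person_age", "person_education",
    "person_emp_exp", "person_gender", "person_home_ownership", "person_id",
    "person_income", "previous_loan_defaults_on_file",
]

# Correlation matrix copied row by row from the report.
matrix = [
    [1.000, 0.043, 0.017, 0.055, -0.037, 0.024, 0.821, 0.092, 0.750, 0.029, 0.030, 0.128, 0.093, 0.029],
    [0.043, 1.000, 0.105, 0.033, 0.666, 0.126, 0.064, 0.012, 0.052, 0.013, 0.091, 0.017, 0.405, 0.068],
    [0.017, 0.105, 1.000, 0.021, 0.124, 0.363, 0.013, 0.013, 0.016, 0.008, 0.085, 0.005, -0.033, 0.198],
    [0.055, 0.033, 0.021, 1.000, 0.022, 0.142, 0.032, 0.015, 0.031, 0.005, 0.083, 0.032, 0.013, 0.081],
    [-0.037, 0.666, 0.124, 0.022, 1.000, 0.415, -0.056, 0.011, -0.050, 0.009, 0.092, -0.002, -0.353, 0.220],
    [0.024, 0.126, 0.363, 0.142, 0.415, 1.000, 0.017, 0.005, 0.018, 0.000, 0.258, 0.093, 0.013, 0.543],
    [0.821, 0.064, 0.013, 0.032, -0.056, 0.017, 1.000, 0.061, 0.888, 0.026, 0.019, 0.122, 0.143, 0.032],
    [0.092, 0.012, 0.013, 0.015, 0.011, 0.005, 0.061, 1.000, 0.066, 0.002, 0.010, 0.041, 0.010, 0.041],
    [0.750, 0.052, 0.016, 0.031, -0.050, 0.018, 0.888, 0.066, 1.000, 0.024, 0.015, 0.101, 0.120, 0.031],
    [0.029, 0.013, 0.008, 0.005, 0.009, 0.000, 0.026, 0.002, 0.024, 1.000, 0.000, 0.000, 0.013, 0.000],
    [0.030, 0.091, 0.085, 0.083, 0.092, 0.258, 0.019, 0.010, 0.015, 0.000, 1.000, 0.061, 0.012, 0.140],
    [0.128, 0.017, 0.005, 0.032, -0.002, 0.093, 0.122, 0.041, 0.101, 0.000, 0.061, 1.000, 0.025, 0.037],
    [0.093, 0.405, -0.033, 0.013, -0.353, 0.013, 0.143, 0.010, 0.120, 0.013, 0.012, 0.025, 1.000, 0.012],
    [0.029, 0.068, 0.198, 0.081, 0.220, 0.543, 0.032, 0.041, 0.031, 0.000, 0.140, 0.037, 0.012, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff for illustration only

# Scan the upper triangle so each pair is reported once.
pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]
for a, b, r in sorted(pairs, key=lambda p: -abs(p[2])):
    print(f"{a} ~ {b}: {r:.3f}")
```

Under this assumed cutoff, the listed pairs involve the same seven variables that carry a "High correlation" alert.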

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

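First rows

index | person_id | person_age | person_gender | person_education | person_income | person_emp_exp | person_home_ownership | cb_person_cred_hist_length | loan_amnt | loan_intent | loan_int_rate | loan_percent_income | loan_status | previous_loan_defaults_on_file
0 | 1 | 22 | female | Master | 71948.0 | 0.0 | RENT | 3.0 | 35000.0 | PERSONAL | 16.02 | 0.49 | 1 | No
1 | 2 | 21 | female | High School | 12282.0 | 0.0 | OWN | 2.0 | 1000.0 | EDUCATION | 11.14 | 0.08 | 0 | Yes
2 | 3 | 25 | female | High School | 12438.0 | 3.0 | MORTGAGE | 3.0 | 5500.0 | MEDICAL | 12.87 | 0.44 | 1 | No
3 | 4 | 23 | female | Bachelor | 79753.0 | 0.0 | RENT | 2.0 | 35000.0 | MEDICAL | 15.23 | 0.44 | 1 | No
4 | 5 | 24 | male | Master | 66135.0 | 1.0 | RENT | 4.0 | 35000.0 | MEDICAL | 14.27 | 0.53 | 1 | No
5 | 6 | 21 | female | High School | 12951.0 | 0.0 | OWN | 2.0 | 2500.0 | VENTURE | 7.14 | 0.19 | 1 | No
6 | 7 | 26 | female | Bachelor | 93471.0 | 1.0 | RENT | 3.0 | 35000.0 | EDUCATION | 12.42 | 0.37 | 1 | No
7 | 8 | 24 | female | High School | 95550.0 | 5.0 | RENT | 4.0 | 35000.0 | MEDICAL | 11.11 | 0.37 | 1 | No
8 | 9 | 24 | female | Associate | 100684.0 | 3.0 | RENT | 2.0 | 35000.0 | PERSONAL | 8.90 | 0.35 | 1 | No
9 | 10 | 21 | female | High School | 12739.0 | 0.0 | OWN | 3.0 | 1600.0 | VENTURE | 14.74 | 0.13 | 1 | No

Last rows

index | person_id | person_age | person_gender | person_education | person_income | person_emp_exp | person_home_ownership | cb_person_cred_hist_length | loan_amnt | loan_intent | loan_int_rate | loan_percent_income | loan_status | previous_loan_defaults_on_file
179990 | 179991 | 31 | male | Master | 136832.0 | 9.0 | RENT | 7.0 | 12319.0 | PERSONAL | 16.92 | 0.09 | 1 | No
179991 | 179992 | 24 | male | High School | 37786.0 | 0.0 | MORTGAGE | 4.0 | 13500.0 | EDUCATION | 13.43 | 0.36 | 1 | No
179992 | 179993 | 23 | female | Bachelor | 40925.0 | 0.0 | RENT | 4.0 | 9000.0 | PERSONAL | 11.01 | 0.22 | 1 | No
179993 | 179994 | 27 | female | High School | 35512.0 | 4.0 | RENT | 5.0 | 5000.0 | PERSONAL | 15.83 | 0.14 | 1 | No
179994 | 179995 | 24 | female | Associate | 31924.0 | 2.0 | RENT | 4.0 | 12229.0 | MEDICAL | 10.70 | 0.38 | 1 | No
179995 | 179996 | 27 | male | Associate | 47971.0 | 6.0 | RENT | 3.0 | 15000.0 | MEDICAL | 15.66 | 0.31 | 1 | No
179996 | 179997 | 37 | female | Associate | 65800.0 | 17.0 | RENT | 11.0 | 9000.0 | HOMEIMPROVEMENT | 14.07 | 0.14 | 1 | No
179997 | 179998 | 33 | male | Associate | 56942.0 | 7.0 | RENT | 10.0 | 2771.0 | DEBTCONSOLIDATION | 10.02 | 0.05 | 1 | No
179998 | 179999 | 29 | male | Bachelor | 33164.0 | 4.0 | RENT | 6.0 | 12000.0 | EDUCATION | 13.23 | 0.36 | 1 | No
179999 | 180000 | 24 | male | High School | 51609.0 | 1.0 | RENT | 3.0 | 6665.0 | DEBTCONSOLIDATION | 17.05 | 0.13 | 1 | No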
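
In the sample rows, loan_percent_income matches loan_amnt divided by person_income, rounded to two decimals, which would explain the high correlation between loan_amnt and loan_percent_income. This is an observation from a handful of rows, not a documented definition of the column. A minimal check on five of the rows shown:

```python
# (loan_amnt, person_income, reported loan_percent_income) from the sample rows.
rows = [
    (35000.0, 71948.0, 0.49),
    (1000.0, 12282.0, 0.08),
    (5500.0, 12438.0, 0.44),
    (12319.0, 136832.0, 0.09),
    (6665.0, 51609.0, 0.13),
]

# Collect any row where the rounded ratio disagrees with the reported value.
mismatches = []
for loan_amnt, income, reported in rows:
    ratio = round(loan_amnt / income, 2)
    if ratio != reported:
        mismatches.append((loan_amnt, income, reported, ratio))

print("rows checked:", len(rows))
print("mismatches:", mismatches)
```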